GME-VARCO-VISION-Embedding
GME-VARCO-VISION-Embedding is a multimodal embedding model that maps text, images, and videos into a shared high-dimensional embedding space, where semantic similarity between them can be computed. It is particularly strong at video retrieval tasks.
Multimodal Fusion
Transformers English
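
To illustrate what comparing items in an embedding space can look like, here is a minimal sketch in plain Python. It does not load or call the model. The three-dimensional vectors, the item names, and the helper functions `cosine_similarity` and `rank_by_similarity` are all hypothetical stand-ins for real embeddings, and cosine similarity is assumed here as the similarity measure, which the description above does not specify.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Sort candidate (name, score) pairs from most to least similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy vectors standing in for a text query embedding and
# for video/image embeddings; real embeddings would have far more dimensions.
query_embedding = [1.0, 0.0, 1.0]
candidate_embeddings = {
    "video_a": [1.0, 0.0, 1.0],
    "video_b": [0.0, 1.0, 0.0],
    "image_c": [1.0, 1.0, 0.0],
}

ranking = rank_by_similarity(query_embedding, candidate_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a retrieval setting, the query vector would come from encoding a text prompt and the candidates from encoding videos or images, and the top-ranked entries would be returned as results.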